fix the issue of GEMM validation failure #378
zhangnju wants to merge 3 commits into ROCm:amd-staging
Conversation
cc @neon60 @j-stephan @adeljo-amd I wonder if this example overlaps with matrix multiplication from #375? If they are similar enough, we should probably just keep one.
I think these two examples use different kernels, so they should not be redundant.
zichguan-amd left a comment
I'm OK with the change. The CPU and GPU errors should be in the same range, and we can definitely use double for more precision. I'll let others weigh in.
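To make the suggestion about double precision concrete, here is a minimal sketch of a host-side reference GEMM that accumulates each dot product in double and compares against the device result with a relative tolerance. The names `reference_gemm` and `validate`, the row-major layout, and the `1e-4` tolerance are assumptions for illustration; they are not taken from the sample's actual validation code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: C = A * B on the host (row-major, A is MxK, B is KxN).
// Each dot product is accumulated in double to limit rounding error as K grows.
std::vector<float> reference_gemm(const std::vector<float>& A,
                                  const std::vector<float>& B,
                                  std::size_t M, std::size_t K, std::size_t N)
{
    std::vector<float> C(M * N);
    for (std::size_t i = 0; i < M; ++i)
    {
        for (std::size_t j = 0; j < N; ++j)
        {
            double acc = 0.0;
            for (std::size_t k = 0; k < K; ++k)
            {
                acc += static_cast<double>(A[i * K + k]) * static_cast<double>(B[k * N + j]);
            }
            C[i * N + j] = static_cast<float>(acc);
        }
    }
    return C;
}

// Hypothetical helper: element-wise comparison with a relative tolerance.
// The default tolerance is an assumed value, not the one used by the sample.
bool validate(const std::vector<float>& got,
              const std::vector<float>& expected,
              double rel_tol = 1e-4)
{
    if (got.size() != expected.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < got.size(); ++i)
    {
        const double diff  = std::fabs(static_cast<double>(got[i]) - static_cast<double>(expected[i]));
        const double scale = std::max(1.0, std::fabs(static_cast<double>(expected[i])));
        if (diff > rel_tol * scale)
        {
            return false;
        }
    }
    return true;
}
```

A relative tolerance scales with the magnitude of each output element, which tends to matter more as the inner dimension grows and the sums get larger.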
constexpr float b_value = 0.02F;
std::fill(B.begin(), B.end(), b_value);
// Set matrix elements to random value on the host.
for (size_t i = 0; i < A.size(); ++i) A[i] = static_cast<float>(rand() / RAND_MAX );
This should be static_cast<double>(rand()) / RAND_MAX. static_cast<float>(rand() / RAND_MAX) performs integer division before the cast, so it yields 0 for every value except rand() == RAND_MAX.
Thanks for the reminder. We can change it to "static_cast<float>(rand() / (RAND_MAX + 1.0f))", which generates a random float in [0, 1).
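The difference between the two forms discussed in this thread can be sketched as follows. The helper names `truncated_sample`, `unit_sample`, and `fill_random` are hypothetical and only illustrate the reviewer's point; they are not part of the sample. The second form uses the reviewer's suggested expression, whose result lies in the closed range [0, 1].

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical helper showing the original expression: rand() / RAND_MAX is
// evaluated as an int division, so the result is 0 unless rand() returns
// exactly RAND_MAX (in which case it is 1).
inline float truncated_sample()
{
    return static_cast<float>(std::rand() / RAND_MAX);
}

// Hypothetical helper showing the reviewer's suggestion: convert to double
// first, then divide. The result lies in [0, 1].
inline float unit_sample()
{
    return static_cast<float>(static_cast<double>(std::rand()) / RAND_MAX);
}

// Hypothetical helper: fill a host matrix with values from unit_sample().
inline void fill_random(std::vector<float>& m)
{
    for (float& x : m)
    {
        x = unit_sample();
    }
}
```

With the truncating form, nearly every matrix element is 0, so the product is almost entirely zeros and validation would exercise very little of the kernel's arithmetic.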
std::fill(B.begin(), B.end(), b_value);
// Set matrix elements to random value on the host.
for (size_t i = 0; i < A.size(); ++i) A[i] = static_cast<float>(rand() / RAND_MAX );
for (size_t i = 0; i < B.size(); ++i) B[i] = static_cast<float>(rand() / RAND_MAX );
I have updated it to "static_cast<float>(rand() / (RAND_MAX + 1.0f))" and verified that it generates random float values in [0, 1).
#include <cstdlib>
#include <cassert>
#include <cstddef>
#include <memory>
I have removed them in the new commit.
Motivation
When running the GEMM sample on an R9700, validation fails if A_rows, A_cols, and B_cols are changed from their default values to 4096.
Technical Details
Test Plan
./hip_matrix_multiplication
Matrix multiplication: [2048x1024] * [1024x1024], block size: 16x16
Validation passed.
./hip_matrix_multiplication --A_rows 4096 --A_cols 4096 --B_cols 4096
Matrix multiplication: [4096x4096] * [4096x4096], block size: 16x16
Validation passed.
./hip_matrix_multiplication --A_rows 4096 --A_cols 512 --B_cols 2048
Matrix multiplication: [4096x512] * [512x2048], block size: 16x16
Validation passed.
Test Result
All tests pass.
Added/Updated documentation?
Included Visual Studio files?
Submission Checklist